{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Local LLM\n",
    "\n",
    "Here, we'll use a server that's serving a local LLM.\n",
    "\n",
    "**Attention** This is OK for prototyping / dev usage, but should not be used for production cases when there might be concurrent requests from different users. As of the time of writing, Ollama is designed for single user and cannot handle concurrent requests see this issue: https://github.com/ollama/ollama/issues/358"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from langchain.prompts.chat import ChatPromptTemplate"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from langserve import RemoteRunnable\n",
    "\n",
    "model = RemoteRunnable(\"http://localhost:8000/ollama/\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's test out the standard interface of a chat model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "prompt = \"Tell me a 3 sentence story about a cat.\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "AIMessage(content=\"\\nSure! Here is a three sentence story about a cat:\\n\\nMittens the cat purred contentedly on the windowsill, basking in the warm sunlight. Suddenly, a bird perched nearby and Mittens' ears perked up, ready to pounce. With lightning quick reflexes, Mittens leapt into the air, but the bird had flown away, leaving Mittens to settle for just lounging in the sun once again.\")"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.invoke(prompt)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "AIMessage(content=\"\\nSure! Here is a three sentence story about a cat:\\n\\nMittens the cat purred contentedly on the windowsill, basking in the warm sunlight. Suddenly, a bird flew by, catching Mittens' attention and causing her to leap into action. With lightning quick reflexes, Mittens pounced on the bird, but it flew away just in time, leaving Mittens frustrated but still purring happily.\")"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "await model.ainvoke(prompt)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Batched API works, but b/c ollama does not support parallelism, it's no faster than using .invoke twice."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 7.65 ms, sys: 6.57 ms, total: 14.2 ms\n",
      "Wall time: 5.51 s\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[AIMessage(content='\\nSure! Here is a three sentence story about a cat:\\n\\nMr. Whiskers was a sleek black cat with bright green eyes. He spent his days lounging in the sunbeams that streamed through the living room window, chasing the occasional fly, and purring contentedly. Despite his lazy demeanor, Mr. Whiskers was always on the lookout for a warm lap to curl up in.'),\n",
       " AIMessage(content='\\nSure! Here is a three sentence story about a cat:\\n\\nMittens the cat purred contentedly on the windowsill, basking in the warm sunlight that streamed through the glass. Suddenly, a tiny bird perched on the ledge outside, tweeting nervously as it eyed the cat with suspicion. Without hesitation, Mittens pounced, snatching the bird in mid-air and devouring it in one quick motion.')]"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%time\n",
    "model.batch([prompt, prompt])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 9.72 ms, sys: 7.59 ms, total: 17.3 ms\n",
      "Wall time: 5.56 s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "for _ in range(2):\n",
    "    model.invoke(prompt)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[AIMessage(content=\"\\nSure, here's a three sentence story about a cat:\\n\\nMittens the cat purred contentedly on the windowsill, basking in the warm sunlight that streamed through the glass. Her bright green eyes sparkled as she watched a bird flit and flutter outside, wishing she could join it in its flight. Just then, her owner entered the room with a bowl of creamy milk, causing Mittens to jump down from the windowsill and rub against their legs in excitement.\"),\n",
       " AIMessage(content='\\nSure! Here is a three sentence story about a cat:\\n\\nMittens the cat purred contentedly on my lap, her soft fur a soothing balm for my frazzled nerves. As I stroked her back, she gazed up at me with big, round eyes, purring even louder. It was hard to resist the charm of this little ball of fluff, and I found myself smiling and scratching behind her ears.')]"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "await model.abatch([prompt, prompt])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Streaming is available by default"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "|S|ure|,| here| is| a| |3| sentence| story| about| a| cat|:|\n",
      "|\n",
      "|M|itt|ens| the| cat| pur|red| content|edly| on| the| windows|ill|,| her| tail| tw|itch|ing| as| she| watched| the| birds| outside|.| Sud|den|ly|,| a| squ|ir|rel| sc|am|per|ed| by| and| Mitt|ens| was| on| high| alert|,| her| ears| per|ked| up| and| ready| to| p|ounce|.| With| light|ning| quick| ref|lex|es|,| Mitt|ens| jump|ed| from| the| windows|ill| and| ch|ased| after| the| squ|ir|rel|,| her| tail| streaming| behind| her|.||"
     ]
    }
   ],
   "source": [
    "for chunk in model.stream(prompt):\n",
    "    print(chunk.content, end=\"|\", flush=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "|The| cat| pur|red| content|edly| on| my| lap|,| its| soft| fur| a| so|othing| bal|m| for| my| fra|zz|led| n|erves|.| As| I| st|rok|ed| its| back|,| it| gaz|ed| up| at| me| with| soul|ful| eyes|,| p|urr|ing| loud|ly| in| appro|val|.| In| that| moment|,| all| was| right| with| the| world|.||"
     ]
    }
   ],
   "source": [
    "async for chunk in model.astream(prompt):\n",
    "    print(chunk.content, end=\"|\", flush=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And so is the event stream API"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'event': 'on_chat_model_start', 'run_id': '6c2bdfc1-d482-4861-886c-c737a50771c3', 'name': '/ollama', 'tags': [], 'metadata': {}, 'data': {'input': 'Tell me a 3 sentence story about a cat.'}}\n",
      "{'event': 'on_chat_model_stream', 'run_id': '6c2bdfc1-d482-4861-886c-c737a50771c3', 'tags': [], 'metadata': {}, 'name': '/ollama', 'data': {'chunk': AIMessageChunk(content='\\n')}}\n",
      "{'event': 'on_chat_model_stream', 'run_id': '6c2bdfc1-d482-4861-886c-c737a50771c3', 'tags': [], 'metadata': {}, 'name': '/ollama', 'data': {'chunk': AIMessageChunk(content='S')}}\n",
      "{'event': 'on_chat_model_stream', 'run_id': '6c2bdfc1-d482-4861-886c-c737a50771c3', 'tags': [], 'metadata': {}, 'name': '/ollama', 'data': {'chunk': AIMessageChunk(content='ure')}}\n",
      "{'event': 'on_chat_model_stream', 'run_id': '6c2bdfc1-d482-4861-886c-c737a50771c3', 'tags': [], 'metadata': {}, 'name': '/ollama', 'data': {'chunk': AIMessageChunk(content=',')}}\n",
      "{'event': 'on_chat_model_stream', 'run_id': '6c2bdfc1-d482-4861-886c-c737a50771c3', 'tags': [], 'metadata': {}, 'name': '/ollama', 'data': {'chunk': AIMessageChunk(content=' here')}}\n",
      "{'event': 'on_chat_model_stream', 'run_id': '6c2bdfc1-d482-4861-886c-c737a50771c3', 'tags': [], 'metadata': {}, 'name': '/ollama', 'data': {'chunk': AIMessageChunk(content=' is')}}\n",
      "{'event': 'on_chat_model_stream', 'run_id': '6c2bdfc1-d482-4861-886c-c737a50771c3', 'tags': [], 'metadata': {}, 'name': '/ollama', 'data': {'chunk': AIMessageChunk(content=' a')}}\n",
      "{'event': 'on_chat_model_stream', 'run_id': '6c2bdfc1-d482-4861-886c-c737a50771c3', 'tags': [], 'metadata': {}, 'name': '/ollama', 'data': {'chunk': AIMessageChunk(content=' ')}}\n",
      "{'event': 'on_chat_model_stream', 'run_id': '6c2bdfc1-d482-4861-886c-c737a50771c3', 'tags': [], 'metadata': {}, 'name': '/ollama', 'data': {'chunk': AIMessageChunk(content='3')}}\n",
      "{'event': 'on_chat_model_stream', 'run_id': '6c2bdfc1-d482-4861-886c-c737a50771c3', 'tags': [], 'metadata': {}, 'name': '/ollama', 'data': {'chunk': AIMessageChunk(content=' sentence')}}\n",
      "{'event': 'on_chat_model_stream', 'run_id': '6c2bdfc1-d482-4861-886c-c737a50771c3', 'tags': [], 'metadata': {}, 'name': '/ollama', 'data': {'chunk': AIMessageChunk(content=' story')}}\n",
      "...\n"
     ]
    }
   ],
   "source": [
    "i = 0\n",
    "async for event in model.astream_events(prompt, version='v1'):\n",
    "    print(event)\n",
    "    if i > 10:\n",
    "        print('...')\n",
    "        break\n",
    "    i += 1"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
